Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization
Data-driven algorithm design, that is, choosing the best algorithm for a
specific application, is a crucial problem in modern data science.
Practitioners often optimize over a parameterized algorithm family, tuning
parameters based on problems from their domain. These procedures have
historically come with no guarantees, though a recent line of work studies
algorithm selection from a theoretical perspective. We advance the foundations
of this field in several directions: we analyze online algorithm selection,
where problems arrive one-by-one and the goal is to minimize regret, and
private algorithm selection, where the goal is to find good parameters over a
set of problems without revealing sensitive information contained therein. We
study important algorithm families, including SDP-rounding schemes for problems
formulated as integer quadratic programs, and greedy techniques for canonical
subset selection problems. In these cases, the algorithm's performance is a
volatile and piecewise Lipschitz function of its parameters, since tweaking the
parameters can completely change the algorithm's behavior. We give a general
sufficient condition, dispersion, defining a family of piecewise Lipschitz
functions that can be optimized online and privately, which includes the
functions measuring the performance of the algorithms we study. Intuitively, a
set of piecewise Lipschitz functions is dispersed if no small region contains
many of the functions' discontinuities. We present general techniques for
online and private optimization of the sum of dispersed piecewise Lipschitz
functions. We improve over the best-known regret bounds for a variety of
problems, prove regret bounds for problems not previously studied, and give
matching lower bounds. We also give matching upper and lower bounds on the
utility loss due to privacy. Moreover, we uncover dispersion in auction design
and pricing problems.
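To make the online optimization setting concrete, the following minimal Python sketch (not code from the paper; the utility functions, learning rate lam, and grid resolution are illustrative assumptions) runs a continuous exponentially weighted forecaster over a one-dimensional parameter, approximated on a finite grid. The toy utilities are piecewise constant with discontinuities spread across the interval, the kind of dispersed family the condition is meant to capture.

    import numpy as np

    rng = np.random.default_rng(0)

    def exp_weights_forecaster(utility_fns, lam=1.0, grid_size=1000):
        # Full-information online optimization of a parameter rho in [0, 1]:
        # each round, sample rho with probability proportional to
        # exp(lam * cumulative past utility at rho), then observe the round's
        # piecewise Lipschitz utility and update the cumulative utilities.
        grid = np.linspace(0.0, 1.0, grid_size)
        cumulative = np.zeros(grid_size)
        total = 0.0
        for u in utility_fns:
            weights = np.exp(lam * (cumulative - cumulative.max()))  # stabilized
            rho = rng.choice(grid, p=weights / weights.sum())
            total += u(rho)
            cumulative += np.array([u(x) for x in grid])
        return total

    # Toy utilities: thresholds whose discontinuities are spread out over the
    # interval (an informally "dispersed" family): u_t(rho) = 1 if rho >= theta_t,
    # else 0.5, with theta_t drawn uniformly from [0.2, 0.8].
    thetas = rng.uniform(0.2, 0.8, size=50)
    fns = [lambda rho, th=th: 1.0 if rho >= th else 0.5 for th in thetas]
    print(exp_weights_forecaster(fns))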
Subset-Based Instance Optimality in Private Estimation
We propose a new definition of instance optimality for differentially private
estimation algorithms. Our definition requires an optimal algorithm to compete,
simultaneously for every dataset D, with the best private benchmark algorithm
that (a) knows D in advance and (b) is evaluated by its worst-case
performance on large subsets of D. That is, the benchmark algorithm need not
perform well when potentially extreme points are added to D; it only has to
handle the removal of a small number of real data points that already exist.
This makes our benchmark significantly stronger than those proposed in prior
work. We nevertheless show, for real-valued datasets, how to construct private
algorithms that achieve our notion of instance optimality when estimating a
broad class of dataset properties, including means, quantiles, and
ℓ_p-norm minimizers. For means in particular, we provide a detailed
analysis and show that our algorithm simultaneously matches or exceeds the
asymptotic performance of existing algorithms under a range of distributional
assumptions.
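As a concrete reading of the benchmark in the case of means, the following hypothetical Python sketch (the function name and removal budget k are illustrative, not taken from the paper) evaluates an estimate against the worst-case mean over all subsets obtained by deleting at most k data points; for the mean these extremes are reached by dropping the k smallest or the k largest points.

    import numpy as np

    def subset_benchmark_error_for_mean(data, estimate, k):
        # Worst-case |estimate - mean(S)| over all subsets S of the data obtained
        # by removing at most k points. For the mean, the extreme subset means are
        # reached by dropping the k smallest or the k largest points.
        x = np.sort(np.asarray(data, dtype=float))
        highest_subset_mean = x[k:].mean()           # drop the k smallest points
        lowest_subset_mean = x[:len(x) - k].mean()   # drop the k largest points
        return max(abs(estimate - highest_subset_mean),
                   abs(estimate - lowest_subset_mean))

    # Toy usage: score a Laplace-noised sample mean against the benchmark.
    # (Illustrative only; a calibrated private mean would clip the data and
    # scale the noise to the clipped sensitivity.)
    rng = np.random.default_rng(1)
    data = rng.normal(0.0, 1.0, size=1000)
    noisy_mean = data.mean() + rng.laplace(scale=1.0 / len(data))
    print(subset_benchmark_error_for_mean(data, noisy_mean, k=10))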
- …